Here are my top 10 static data visualizations from 2016. All plots were generated with R graphics and graphics packages from CRAN, Bioconductor, and GitHub.

Motivation

Data here, data there, data everywhere! A mountain of data by itself, without visualization and context, cannot provide meaningful insight. When it comes to data visualization, 2016 was a breakout year for me. There were a massive number of lessons in preparing, plotting, and modeling data. It was challenging, frustrating, exhilarating, and, dare I say, fun all at the same time. I wrote about 12 blog posts, published 4 documents on RPubs, 1 Kaggle ‘kernel’, and 11 Shiny apps (not all deployed), and made several interactive and static maps. By no means have I done it all or know it all; there is a lot more to learn!

My selection and ordering criteria are not scientific; I simply chose based on my own level of satisfaction after each plot was rendered. The top 10 static data visualizations listed below were created for data projects I undertook in 2016. I have also added runner-up data-driven plots, maps, and graphs at the end. I will cover my top 10 dynamic plots and favorite web applications from 2016 in separate blog posts.

The majority of my time was spent learning, applying what I learned, and wrangling, cleaning, and tidying data. If you don’t love this part, it will make you question your existence. Fortunately, I do! I can confirm that cleaning does indeed take 80% of the effort of organizing and arranging data. Once tidy data is achieved (“variables in columns, observations in rows”), applying statistical summaries, modeling, and visualization becomes easier. Note: All of the figures listed here are in PNG format. The originals, including the code, can be found in the blog posts and/or my GitHub repo.
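To make the "variables in columns, observations in rows" idea concrete, here is a minimal sketch of reshaping a wide table into tidy form. The data frame and its column names are made up for illustration, and tidyr's `pivot_longer()` is my choice for the sketch, not necessarily what was used in the original projects.

```r
library(tidyr)

# Hypothetical wide table: one row per student, one column per subject
wide <- data.frame(
  student = c("a", "b", "c"),
  math    = c(78, 85, 92),
  science = c(81, 79, 88)
)

# Tidy form: one row per observation (student x subject),
# one column per variable (student, subject, score)
tidy <- pivot_longer(
  wide,
  cols      = c(math, science),
  names_to  = "subject",
  values_to = "score"
)

print(tidy)
```

Once the data is in this shape, grouping, summarizing, and mapping `subject` to a color or facet in a plot each take one line.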

Without further ado, here are my top 10 static data visualization plots, graphs, and maps for 2016.

Number 10: Highlight interesting data - Signal/Noise

This plot was generated for an analysis of math and science score performance for black female students in specific socioeconomic groups.
Number 10: Two data frames and colors were used to highlight the interesting data (signal) against the noise.
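The two-data-frame highlighting trick can be sketched roughly as follows. This is not the original plot code: the scores are simulated, and the column names and the `"A"` group are placeholders I made up.

```r
library(ggplot2)

# Simulated stand-in data (not the real student scores)
set.seed(1)
scores <- data.frame(
  math    = rnorm(200, mean = 500, sd = 50),
  science = rnorm(200, mean = 500, sd = 50),
  group   = sample(c("A", "B", "C", "D"), 200, replace = TRUE)
)

# Second data frame holding only the rows worth highlighting
signal <- subset(scores, group == "A")

# Layer 1 draws everything in a muted grey (the noise);
# layer 2 redraws the subset on top in a strong color (the signal)
p <- ggplot(mapping = aes(x = math, y = science)) +
  geom_point(data = scores, colour = "grey80") +
  geom_point(data = signal, colour = "firebrick", size = 2)

print(p)
```

Because each layer gets its own `data` argument, the highlighted subset can be changed without touching the background layer.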


Number 9: Student Science/math - Interactive facet

This plot combines three categorical variables in one visual to show black female students' scores and the statistical trend where there is sufficient data.
Number 9: ggplot2 and plotly working together; interactive categorical data.

Number 8: Election Phone Bank 1

I participated in the 2016 US election, working in phone banks and canvassing. I generated dummy data and created this slide, which shows the daily count of voters who need to be called back, either at their request or because of other factors. It identifies voters by gender, race, and age.
Number 8: 2016 election phone bank daily counts by race/gender/age.

Number 7: Election Phone Bank 2

This visual is a follow-up to Number 8. It counts voters' responses (yes, not sure, no response, no, call back) and breaks down the counts by ethnicity; it also shows the aggregate for each category as a light yellow bar. Here I layered a PNG image and two data frames for the bar charts. ggplot2 is the star here.
Number 7: 2016 election phone bank daily counts by race/gender/age; ggplot2 with multiple data frames and a layered image.
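A rough sketch of the aggregate-bar-behind-the-breakdown idea, using two data frames in one ggplot2 call. The counts and group labels below are dummy values I invented, and the PNG layer is left out to keep the example self-contained.

```r
library(ggplot2)

# Dummy call counts per response and (placeholder) ethnicity group
calls <- data.frame(
  response  = rep(c("yes", "no", "call back"), each = 2),
  ethnicity = rep(c("group 1", "group 2"), times = 3),
  n         = c(12, 8, 5, 9, 7, 4)
)

# Second data frame: total count per response category
totals <- aggregate(n ~ response, data = calls, FUN = sum)

# Wide light-yellow total bars go down first,
# then narrower dodged bars per ethnicity are drawn on top
p <- ggplot() +
  geom_col(data = totals, aes(x = response, y = n),
           fill = "lightyellow", colour = "grey70", width = 0.9) +
  geom_col(data = calls, aes(x = response, y = n, fill = ethnicity),
           position = "dodge", width = 0.6)

print(p)
```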

Number 6: US MSA geojson shape

I am currently working on a project that digs into the prevalence of cancer in the US. The Centers for Disease Control and Prevention (CDC) keeps statistics for all types of cancer by US Metropolitan Statistical Area (MSA). This plot was created by converting the MSA shapes from the US Census to geojson and layering the geojson shapes on top of a US tile from OpenStreetMap.
Number 6: US Metropolitan Statistical Areas (MSAs), geographical regions with a relatively high population density; geojson shape file; packages used: ggmap, leaflet.

Number 5: Continent Subregions

The following is a re-creation of an example from the help section. I like the regional breakdown and coloring of the continent. tmap is the star here.
Number 5: The African continent divided categorically by subregion; package used: tmap.

Number 4: District of Columbia wards geojson layering

This plot shows the 8 wards of Washington, DC, with json shapefiles layered on top of an OpenStreetMap tile.
DC wards map

Number 3: 9.5 Million pixels

I took this picture with a Samsung smartphone in the fall of 2016. With its deep colors, it is rich with pixels. I read the image into R with the EBImage Bioconductor package and converted it into a data frame, resulting in 9.5 million pixels. This is big data, above and beyond what a personal computer handles comfortably. R graphics rendered the image one pixel at a time, and the first rendering took 25 minutes or so.
Number 3: Stress testing R graphics with big data. Converted one of my images to a data frame. Dim (2x9.5 million). Package used: ggplot2.
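For the curious, converting an image array to a one-row-per-pixel data frame can be sketched like this. To stay self-contained, the sketch uses a tiny random array as a stand-in for the photo; I am assuming an (x, y, channel) layout with values in [0, 1], which is what I would expect from `EBImage::readImage()`. The column layout and the use of `geom_raster()` are my choices here and may differ from the original code.

```r
library(ggplot2)

# Tiny stand-in for an RGB image: 4 x 3 pixels, 3 colour channels.
# Assumption: a real photo would be loaded with EBImage::readImage().
set.seed(42)
img <- array(runif(4 * 3 * 3), dim = c(4, 3, 3))

# One row per pixel. expand.grid() varies x fastest, which matches
# the column-major order that as.vector() uses on each channel.
pixels <- expand.grid(x = seq_len(dim(img)[1]), y = seq_len(dim(img)[2]))
pixels$colour <- rgb(
  as.vector(img[, , 1]),
  as.vector(img[, , 2]),
  as.vector(img[, , 3])
)

# Draw each pixel with its own colour; flip y so row 1 is at the top
p <- ggplot(pixels, aes(x = x, y = y, fill = colour)) +
  geom_raster() +
  scale_fill_identity() +
  scale_y_reverse() +
  coord_fixed()

print(p)
```

The number of rows equals width times height, so a multi-megapixel photo turns into a data frame with millions of rows, which is why rendering gets slow.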

Number 2: Aircraft fatality

I generated this plot for a Kaggle “kernel”. It packs four plots into one and uses the new ggplot feature for annotation. The plot shows statistics for aircraft accidents in the US from the NTSB database. I also ordered the factors for the bar charts using the forcats package, created a theme, and got fancy with fonts and colors.
Number 2: US aircraft accidents that caused fatalities, 1948-2016. Data: NTSB. Graphics package used: ggplot2.
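Here is a small sketch of ordering bars with forcats. The categories and counts are invented for illustration and are not NTSB figures; `fct_reorder()` is one of several forcats helpers that could do this, and I am not claiming it is the exact one used in the kernel.

```r
library(ggplot2)
library(forcats)

# Invented counts for illustration only (not NTSB data)
accidents <- data.frame(
  phase = c("Cruise", "Landing", "Takeoff", "Approach"),
  fatal = c(120, 310, 95, 240)
)

# Reorder the factor levels by the count so the bars come out sorted
accidents$phase <- fct_reorder(accidents$phase, accidents$fatal)

p <- ggplot(accidents, aes(x = phase, y = fatal)) +
  geom_col() +
  coord_flip()

print(p)
```

Without the reordering step, ggplot2 would place the bars in alphabetical order, which makes the ranking harder to read.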

Number 1: New York Taxi traffic flow

This plot was generated while trying to plot a million taxi drop-off points in New York City in a small footprint.
Number 1: One million taxi flow data points from the New York City project. Data: NY City taxi commission. Graphics package used: ggmap.

Runner-up graphics for 2016

These are in no particular order.

A. Random Forest Model plot

Random Forest Tree plot. Package used:

B. Font and color with xkcd

Font and color testing with the xkcd package. Line chart with a subset of the gapminder data.

C. DC, Maryland, Virginia (DMV) metropolitan county population density choropleth

Choroplethr population distribution map for DC, Maryland, and Virginia

D. DMV Airbnb space available categorical locations

DMV Airbnb space categorical locations

E: Food Truck USA

This visual counts the US cities with the most food trucks.
E: US cities with the most food trucks. Image layering. Package used: ggplot2.

F: Women vs Men salary comparison for college professors.

Women vs Men salary comparison for college professors.

Credit

Although I made the plots, I want to take a moment to thank all the great minds who created the packages used to make them and who answered questions on Stack Overflow, GitHub, and Twitter. Most of the heavy lifting was done by them!